
Poster: brewster (Administrator, Curator, or Staff) Date: May 4, 2008 7:39am
Forum: texts Subject: Re: Is the archive "dark web" to Google


All search engines are welcome to and do index all the books in the Internet Archive. All the metadata is harvestable in multiple ways.

The text inside the books is also available for indexing in most circumstances. The digital books sponsored by Microsoft, however, come with the "no commercial services" restriction, so their inside text is not available to commercial robot crawling.

Bulk access to the books is encouraged.

I hope this is clear.

-brewster

Poster: stbalbach Date: May 4, 2008 10:53am
Forum: texts Subject: Re: Is the archive "dark web" to Google

I did some tests and found that Google does not index the full text of books on IA, at least not the ones I tested. Metadata yes, but not the full text, even for non-Microsoft books. My tests are few and anecdotal and may not be representative, but I'm just passing on my findings. I can post more detailed examples if anyone wants to look into it.

Poster: brewster (Administrator, Curator, or Staff) Date: May 4, 2008 2:37pm
Forum: texts Subject: Re: Is the archive "dark web" to Google

That is odd, since we put a link on the book display page specifically because a Google representative kept telling the press they had a hard time crawling these books.

Hopefully they get better at it.

-brewster